The Geometry of Kernelized Spectral Clustering
Authors
Abstract
Clustering of data sets is a standard problem in many areas of science and engineering. The method of spectral clustering is based on embedding the data set using a kernel function, and using the top eigenvectors of the normalized Laplacian to recover the connected components. We study the performance of spectral clustering in recovering the latent labels of i.i.d. samples from a finite mixture of nonparametric distributions. The difficulty of this label recovery problem depends on the overlap between mixture components and how easily a mixture component is divided into two non-overlapping components. When the overlap is small compared to the indivisibility of the mixture components, the principal eigenspace of the population-level normalized Laplacian operator is approximately spanned by the square-root kernelized component densities. In the finite sample setting, and under the same assumption, embedded samples from different components are approximately orthogonal with high probability when the sample size is large. As a corollary, we control the fraction of samples mislabeled by spectral clustering under finite mixtures with nonparametric components.
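The pipeline described in the abstract (kernel embedding, top eigenvectors of the normalized Laplacian, label recovery) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the `bandwidth` parameter, the use of the matrix D^{-1/2} K D^{-1/2} as the normalized Laplacian, the row normalization, and the small k-means routine with farthest-point initialization are all assumptions chosen for the example, and every function name is hypothetical.

```python
import numpy as np


def gaussian_kernel(X, bandwidth):
    """Pairwise Gaussian kernel matrix (assumed kernel; the source does not fix one)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))


def spectral_embedding(X, n_clusters, bandwidth=1.0):
    """Embed each sample using the top eigenvectors of D^{-1/2} K D^{-1/2}."""
    K = gaussian_kernel(X, bandwidth)
    deg = K.sum(axis=1)
    L = K / np.sqrt(np.outer(deg, deg))      # normalized kernel matrix
    _, eigvecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    V = eigvecs[:, -n_clusters:]             # top n_clusters eigenvectors
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    return V / np.maximum(norms, 1e-12)      # project rows onto the unit sphere


def _assign(Z, centers):
    """Index of the nearest center for every row of Z."""
    dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)


def kmeans(Z, n_clusters, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = _assign(Z, centers)
        for j in range(n_clusters):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return _assign(Z, centers)


# Toy data: two well-separated Gaussian blobs (a stand-in for mixture components).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(10.0, 0.5, (30, 2))])
Z = spectral_embedding(X, n_clusters=2, bandwidth=1.0)
labels = kmeans(Z, n_clusters=2)
```

On toy data like this, where the components barely overlap, embedded points from different blobs should come out nearly orthogonal, which is the finite-sample behavior the abstract describes. With heavier overlap or a poorly chosen bandwidth, the embedding would be expected to degrade.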
Similar resources
The Geometry of Kernelized Spectral Clustering by Geoffrey Schiebinger,
Clustering of data sets is a standard problem in many areas of science and engineering. The method of spectral clustering is based on embedding the data set using a kernel function, and using the top eigenvectors of the normalized Laplacian to recover the connected components. We study the performance of spectral clustering in recovering the latent labels of i.i.d. samples from a finite mixture...
Towards Finding a New Kernelized Fuzzy C-means Clustering Algorithm
The Kernelized Fuzzy C-Means clustering technique is an attempt to improve the performance of the conventional Fuzzy C-Means clustering technique. Recently this technique, in which a kernel-induced distance function is used as a similarity measure instead of the Euclidean distance used in the conventional Fuzzy C-Means clustering technique, has gained popularity in the research community. Like th...
Review and Comparison of Kernel Based Fuzzy Image Segmentation Techniques
This paper presents a detailed study and comparison of some Kernelized Fuzzy C-means Clustering based image segmentation algorithms. Four algorithms have been used: Fuzzy Clustering, Fuzzy CMeans (FCM) algorithm, Kernel Fuzzy CMeans (KFCM), Intuitionistic Kernelized Fuzzy CMeans (KIFCM), Kernelized Type-II Fuzzy CMeans (KT2FCM). The four algorithms are studied and analyzed both quantitatively and qual...
A New Kernelized Fuzzy C-Means Clustering Algorithm with Enhanced Performance
Recently, the Kernelized Fuzzy C-Means clustering technique, in which a kernel-induced distance function is used as a similarity measure instead of the Euclidean distance used in the conventional Fuzzy C-Means clustering technique, has gained popularity in the research community. Like the conventional Fuzzy C-Means clustering technique, this technique also suffers from inconsistency in its performa...
Kernel methods in computer vision: object localization, clustering, and taxonomy discovery
In this thesis we address three fundamental problems in computer vision using kernel methods. We first address the problem of object localization, which we frame as the problem of predicting a bounding box around an object of interest. We develop a framework in Chapter II for applying a branch and bound optimization strategy to efficiently and optimally detect a bounding box that maximizes obje...
Publication date: 2014